ABSTRACT


A Customer Relationship Management (CRM) system is an information
management and analysis tool that helps businesses and other
organizations manage their interactions with customers. CRM systems
were originally designed for large corporations, but the internet has
allowed small business owners to take advantage of these tools as
well. Customer data is collected in a CRM database, which allows for
advanced analysis such as customer segmentation and contact
history. Customer relationship management is also a
process in which a business or other organization administers its
interactions with customers, typically using data analysis to study
large amounts of information. In this article, we explain
how an e-commerce company can apply its customer
relationship management system to analyze its customer base by
customer lifetime value (CLTV), a key marketing metric that evaluates
the impact and outcomes of the firm's customer relationship management
strategies and tactics, in order to increase revenue through better
marketing campaigns. E-commerce companies consider
customers their most important asset, and it is essential to
estimate the potential value of this asset. Hence, a model for
calculating customer value is essential in this domain. We
describe a general modeling approach, based on the BG-NBD and
Gamma-Gamma models, for calculating customer value in the
e-commerce domain. This model extends existing models from the
field of direct marketing by taking into account a sample set of
variables required for evaluating customer value in an e-commerce
environment. In addition, we present an algorithm for generating this
model from historical data, as well as an application of this modeling
approach to create a model for e-commerce in Python. This model
provides more accurate predictions than existing models regarding
the future income generated by customers.
The segmentation of customers according to their
customer lifetime value (CLTV) enables companies to 
adequately build long-term relationships with 
customers and effectively manage investments into 
marketing tools. CLTV contributes to solving a 
number of problems such as decisions related to 
addressing, retaining and acquiring customers, or 
issues concerning a company’s long-term value 
(Haenlein et al., 2006). Many different CLTV models 
were devised in recent decades and, at the same time, 
the development of ICT gave rise to e-commerce, 
which is a fast-growing retail market worldwide. An
important part of e-commerce is online shopping,
which offers retail sales directly to consumers.


Companies engaged in e-commerce have high data 
availability due to the interactions of customers with 
their websites and other Internet-based services. The 
high level of competition, especially in online 
shopping, drives companies to spend their financial 
resources on marketing activities as efficiently as 
possible, which can be helped by implementing a 
CLTV model that uses available historical data to 
estimate customer value. However, in their effort to 
introduce CLTV as a decision-making basis for 
marketing management, companies operating an 
online store face the issue of selecting the appropriate 
CLTV model that would be suitable for their kind of 
business. 
Customers are central to all marketing activities of a
company because not only do they generate income, 
but they increase the company’s market value as well. 
Marketing emphasizes the interconnection of all 
processes and activities that create, communicate and 
provide values for customers, including customer 
relationship management. 


In the past two decades, the field of customer 
relationship management (CRM) went through a 
significant transformation thanks to information and 
communication technologies (mainly database and 
analytical technology). When analyzing customer 
feedback, companies no longer have to rely only on 
the aggregated results of quantitative and qualitative 
research (e.g., questionnaires, focus groups), but they 
can use their own customer data and concentrate on 
selected groups or individual customers. This was 
achieved thanks to the new possibilities of storing and 
processing available data about individual customers 
(Jasek et al., 2018). 


The CLTV approach forms a bridge between 
marketing and financial metrics, which means that 
marketing activities are always related to financial 
metrics, allowing space for optimization and 
management (Williams et al., 2015). CLTV shows the 
way in which (changes in) customer behavior (e.g., 
increased purchase, retention) can influence future 
profitability. The relevancy of CLTV applications is 
leveraged mainly by customer behavior impacting 
retention, customer-level attributes impacting 
customer loyalty (e.g., age and gender), and national 
cultural dimensions affecting the drivers of purchase, 
frequency and contribution margin. All of these (and
other) components used for appropriate CLTV models
with available data constitute both direct and indirect
influences on CLTV calculations. The main
researched applications of CLTV are aimed at the
business-to-consumer context, while the
business-to-business applications are focused on customer asset
management (Nenonen et al., 2016).


Customer Relationship Management (CRM): 
Customer relationship management (CRM) is a 
technology for managing all your company’s 
relationships and interactions with customers and 
potential customers. The goal is simple: Improve 
business relationships. A CRM system helps
companies stay connected to customers, streamline 
processes, and improve profitability. 


When people talk about CRM, they are usually 
referring to a CRM system, a tool that helps with 
contact management, sales management, productivity, 
and more. 


A CRM solution helps you focus on your 
organization’s relationships with individual people — 
including customers, service users, colleagues, or 
suppliers — throughout your lifecycle with them, 
including finding new customers, winning their 
business, and providing support and additional 
services throughout the relationship. 


Customer Lifetime value (CLTV): 

Customer Lifetime value (CLTV) is a key marketing 
metric that allows you to measure the impact and 
outcomes of the firm’s customer relationship 
management strategies and tactics. CLTV models are 
used in the field of marketing to evaluate the lifetime 
value of customers in conventional businesses. 


Customer lifetime value (CLTV) is the prediction of the
net profit attributed to a company's entire future
relationship with a customer. The model can be simple
or sophisticated, depending on how complex the
predictive analytics techniques are.


Lifetime value is a critical metric because it represents
the maximum amount that a company should be willing
to spend to acquire a new customer. As a result, it is
crucial in determining the payback of marketing
expenses used in marketing mix modeling.


It allows companies to know exactly how much each 
customer is worth in monetary terms and therefore 
exactly how much a marketing department should be 
willing to spend to acquire each customer. It is a 
concept adopted from direct marketing, which looks at
long-term customer behavior as the key to
success. The norms for this success are based on

> The cost of acquiring a new customer and

> The benefits & costs of retaining an existing
customer


Relationship marketing is the key concept that helps
companies develop a loyal consumer base. It
embraces all the steps that companies undertake to
know and provide value to their customers. It is more
profitable to have a set of regular, long-term, loyal and
profitable customers than to have a larger number of
customers. The 20:80 rule of marketing is that 20% of
customers account for 80% of the company's
profits, and it is much cheaper to retain a consumer
than to attract a new one. To reach these customers
and convert them into partners is the real challenge
(Ramachandran et al., 2006).


CLTV Models: 

There are several ways (models) you can use to
calculate the numbers mentioned above. It is
important to understand that different models gather
and use past data a little differently. Some
are simpler and offer a more general number; others,
like the machine learning models, use additional data
and a robust algorithm to give you a much more
complete overview of your customers and their
particularities.


Let’s dive into those models a little bit further: 

A. Aggregate Model 

The Aggregate Model is probably the most common
method out there. It has been around the longest, and
it is the most straightforward way to calculate CLTV.
The aggregate model uses a constant spend rate and
churn for all clients. In this method, we have a single
CLTV, or in other words, a single group of customers
rather than individuals.


Since this model creates one single predicted CLTV
value, there might be some drawbacks to using it.
Because all of your customers are grouped together,
you might see a higher customer churn due to
seasonality or a higher monetary value for
transactions because of a few “big spenders” that
influence the overall value.
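
To make the aggregate approach concrete, the following minimal sketch computes one CLTV figure from company-wide averages. The function name, the parameter names and the specific formula (customer value per period, multiplied by an expected lifetime of 1 / churn rate and by a profit margin) are illustrative assumptions and are not taken from the study.

```python
def aggregate_cltv(avg_order_value, purchase_frequency, churn_rate, profit_margin=1.0):
    """Return one CLTV figure for the whole customer base.

    All inputs are company-wide averages per period (assumed names):
    avg_order_value     average revenue per order
    purchase_frequency  average number of orders per customer per period
    churn_rate          share of customers lost per period, in (0, 1]
    profit_margin       share of revenue kept as profit
    """
    if not 0 < churn_rate <= 1:
        raise ValueError("churn_rate must be in the interval (0, 1]")
    customer_value = avg_order_value * purchase_frequency  # revenue per customer per period
    expected_lifetime = 1 / churn_rate                     # expected number of periods
    return customer_value * expected_lifetime * profit_margin


# Hypothetical inputs, chosen only for illustration
example_cltv = aggregate_cltv(avg_order_value=50.0, purchase_frequency=4.0,
                              churn_rate=0.25, profit_margin=0.1)
print(round(example_cltv, 2))
```

Because every customer shares the same spend rate and churn, a handful of very large orders or a seasonal churn spike moves this single number directly, which is the drawback described above.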


B. Cohort Model 

As the name says, this model uses cohorts to group
customers and then calculates the CLTV for each
group. The premise here is
that customers grouped in a cohort have the same
spending patterns or fairly similar behavior patterns.


The algorithm will create cohorts based on your 
customers’ start date (by month). We usually consider 
the months of the year to group clients since each 
month usually has different marketing campaigns 
targeted to reach different kinds of people that, 
therefore, might be characterized by different 
behavior patterns. 
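
The cohort grouping described above can be sketched with pandas as follows. The toy orders table and its column names are hypothetical, and the per-cohort figure computed here is only the historical revenue per customer, used as a simple stand-in for a cohort-level CLTV.

```python
import pandas as pd

# Hypothetical order history (illustrative values only)
orders = pd.DataFrame({
    "customer_id": [1, 1, 2, 2, 3, 4],
    "order_date": pd.to_datetime([
        "2011-01-05", "2011-02-10", "2011-01-20",
        "2011-03-02", "2011-02-14", "2011-02-25",
    ]),
    "amount": [20.0, 30.0, 15.0, 25.0, 40.0, 10.0],
})

# A customer's cohort is the month of their first purchase (start date)
first_purchase = orders.groupby("customer_id")["order_date"].transform("min")
orders["cohort"] = first_purchase.dt.to_period("M")

# One row per cohort: number of customers, total revenue, revenue per customer
cohort_summary = orders.groupby("cohort").agg(
    customers=("customer_id", "nunique"),
    revenue=("amount", "sum"),
)
cohort_summary["revenue_per_customer"] = (
    cohort_summary["revenue"] / cohort_summary["customers"]
)
print(cohort_summary)
```

Each cohort gets its own value, so a campaign that attracted low-spending customers in one month no longer distorts the figure for the other months.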


C. Probabilistic Model 

There are several probabilistic models that are
commonly used to calculate CLTV predictions. They
all use the same data mentioned before (Average
Order Value, Purchase Frequency, Customer Churn),
but they have slightly different approaches and
probability distributions.


Some of the models often used by companies are:
> Pareto/NBD Model
> BG/NBD Model
> MBG/NBD Model
> BG/BB Model
> Gamma-Gamma Model
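
As an illustration of how two of these models turn fitted parameters into customer-level quantities, the sketch below implements the standard closed-form expressions for the BG/NBD probability that a customer is still active and for the Gamma-Gamma conditional expected average transaction value. The parameter values are hypothetical placeholders rather than estimates from the dataset used in this study; in practice they would come from fitting the models to transaction data.

```python
def bgnbd_p_alive(x, t_x, T, r, alpha, a, b):
    """BG/NBD probability that a customer is still active.

    x    number of repeat purchases
    t_x  time of the last purchase (same unit as T)
    T    customer age, i.e. time since the first purchase
    r, alpha, a, b  fitted BG/NBD parameters (hypothetical values below)
    """
    if x == 0:
        # Under BG/NBD a customer can only drop out after a repeat purchase
        return 1.0
    ratio = (a / (b + x - 1)) * ((alpha + T) / (alpha + t_x)) ** (r + x)
    return 1.0 / (1.0 + ratio)


def gamma_gamma_expected_value(x, m_x, p, q, gamma):
    """Gamma-Gamma conditional expected average transaction value.

    x    number of observed transactions
    m_x  observed average value of those transactions
    p, q, gamma  fitted Gamma-Gamma parameters (hypothetical values below)
    """
    return p * (gamma + m_x * x) / (p * x + q - 1)


# Hypothetical parameters, for illustration only
params = dict(r=0.25, alpha=4.0, a=0.8, b=2.5)
recent_buyer = bgnbd_p_alive(x=2, t_x=30.0, T=30.0, **params)
lapsed_buyer = bgnbd_p_alive(x=2, t_x=30.0, T=60.0, **params)
print(recent_buyer > lapsed_buyer)  # a longer silence lowers P(alive)
print(gamma_gamma_expected_value(x=2, m_x=30.0, p=6.0, q=4.0, gamma=15.0))
```

The first function captures how often and how recently a customer buys, the second how much they spend per transaction; a CLTV forecast combines the two.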


D. Machine Learning Models 

Machine Learning (ML) is an essential Artificial
Intelligence (AI) tool that can help predict CLTV
with great accuracy. This is because ML models
use algorithms that find patterns in the data you have
collected to forecast future customer
behaviors more accurately, which has enormous benefits.


As part of the machine learning process, we will also
estimate the Recency, Frequency, and Monetary
Value (RFM) for the transactions. This helps give you
a clear overview of each user's average purchase
amount, lifetime duration, and frequency of
purchase.


CLTV models in E-commerce: 

For this type of CLTV model to work, you need to 
have previous data and prepare it so the algorithm can 
do its job. This process involves removing duplicates 
and getting rid of empty fields or data that are 
incorrectly formatted. In the end, it doesn’t matter 
which CLTV model you choose; you will need to 
prepare and clean up your data first. 
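
The clean-up steps just described (removing duplicates, empty fields and incorrectly formatted values) can be sketched as follows. The toy table is hypothetical; its column names mirror the dataset used later in this article.

```python
import pandas as pd

# Hypothetical raw extract with a duplicate row, a missing ID and a bad date
raw = pd.DataFrame({
    "CustomerID": [17850, 17850, None, 13047, 13047],
    "InvoiceDate": ["2010-12-01 08:26", "2010-12-01 08:26",
                    "2010-12-01 08:34", "not a date", "2010-12-01 08:45"],
    "UnitPrice": [2.55, 2.55, 3.39, 1.85, 4.25],
})

# 1. Remove exact duplicate rows
clean = raw.drop_duplicates().copy()

# 2. Parse dates; values that cannot be parsed become NaT instead of raising
clean["InvoiceDate"] = pd.to_datetime(clean["InvoiceDate"], errors="coerce")

# 3. Drop rows whose customer ID or date is missing or was badly formatted
clean = clean.dropna(subset=["CustomerID", "InvoiceDate"])
print(clean)
```

Coercing bad values to missing first and dropping them in one step keeps the rule for "unusable row" in a single place.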


The past three decades saw the introduction of a vast 
number of different models and approaches to 
calculating CLTV designed for various types of 
companies, businesses or chosen management views. 
One of the possible and often mentioned divisions of 
CLTV models according to the customer-company 
relationship is into contractual relations (lost for 
good, retention), semi-contractual relations and non- 
contractual relations (always a share, migration). 


Only two studies were found in the Web of Science
that include a larger number of comparisons of
selected CLTV models based on empirical research,
that is, a comparison of the predictive capabilities
of selected CLTV models on a single dataset on the
basis of statistical metrics. Donkers et al. (2007)
analyzed a dataset from an insurance company with
contractual settings and concluded that simple profit
regression models achieve the best performance.
Batislam et al. (2007) used a dataset from a grocery
retailer, focusing on store cards and their
usage as the drivers of higher purchase frequency by
customers. The results confirm the better performance
of their own modified Beta Geometric/NBD model
(BG/NBD), customized to the specified business
settings, in comparison with the Pareto/NBD and original
BG/NBD models.


It can be stated that even simple models achieve 
excellent prediction results despite the more complex 
models being expected to capture the depth of 
relationship developments better. Similarly, it can be 
expected that modified models or those designed for 
specific conditions and environment will produce 
better predictions in relevant cases than more 
complex, universally applicable models (achieving 
consistently good results in various situations). This 
article focuses on non-contractual relations typical for 
e-commerce companies engaged in online shopping. 
Such companies usually have at their disposal an 
extensive database concerning their customers, which 
they use for internal purposes (e.g., financial
management, marketing). This kind of online retail 
market, focusing on selling to end customers, has 
been growing continuously and it can thus be 
expected that the number of Internet-based services 
such as online stores will increase. The same applies 
to the competitive pressure put on them. The focus on
e-commerce companies engaged in online shopping is
therefore very topical in both local and global
contexts.


NEED OF THE STUDY

> The study evaluates CLTV in order to make better
decisions on customer acquisition costs (CAC),
improved forecasting, improved profitability and
strategic marketing practices.

> To segment customers based on RFM
metrics.


OBJECTIVES OF THE STUDY 

> To Segment Customers Based on RFM Scores. 

> To Segment Customers Based on CLTV. 

> To analyse the CLTV in improving customer
retention and avoiding attrition.

> To forecast the CLTV to make better decisions on
customer acquisition costs (CAC).


SCOPE OF THE STUDY

> The study is confined to modeling Customer
lifetime value (CLTV), a marketing
metric that projects the value of a customer over
the entire history of that customer's relationship
with an e-commerce company, using BG-NBD
statistical models in Python.


LIMITATIONS OF THE STUDY

> Inaccurate data can lead to misleading results.

> The study is confined to CLTV modeling using
BG-NBD statistical models in Python.

> This study is confined to 45 days only.


METHODOLOGY OF THE STUDY 


This study is entirely based on Secondary Data 
Analysis, through research papers, journals, articles, 
websites etc. 


Secondary Data Collection: 
A sample e-commerce dataset is gathered from a
secondary data source on the internet.


Source: UCI Machine Learning Repository. 
Website: https://archive.ics.uci.edu/ml/index.php 


TOOLS AND TECHNIQUES 
Software Used: 

> Python 3.10 

> Jupyter Notebook 

> MS Excel 


Statistical Models: 
BG-NBD and Gamma-Gamma models. 


RESEARCH DESIGN
Six Phases Of CLTV Modelling:

1. Formulation of research objectives
2. Model selection and justification
3. Data understanding
4. Data preparation
5. Data analysis
6. Discussion of results


Figure 1: Research Phases CLTV Modelling.


Model Selection And Justification: 

According to Farris et al. (2008), several
concepts are used in measuring customer value: Customer
Profitability Analysis (CPA), Recency and Retention
Rate Analysis, and Customer Lifetime Value (CLTV).
Researchers recommend CLTV as a metric for
selecting customers and designing marketing
programs (Reinartz and Kumar, 2003; Rust
et al., 2004).


Jasek et al. (2018) compare the CLTV predictive
abilities, using selected evaluation metrics, of selected
CLTV models: the Extended Pareto/NBD model
(EP/NBD), the Markov chain model and the Status Quo
model. Their article uses six online store datasets with
annual revenues in the order of tens of millions of
euros for the comparison. The EP/NBD model
outperformed the other selected models in a majority of
evaluation metrics and can be considered good and
stable for non-contractual relations in online
shopping (Jasek et al., 2018).


DATA COLLECTION, ANALYSIS &
INTERPRETATION:

DATA UNDERSTANDING:

Dataset:

> The dataset includes sample sales between
01/12/2009 and 09/12/2011.

> In this article, the years 2010-2011 will be
examined.

> The product catalog of this company includes
souvenirs.

> The vast majority of the company's customers are
corporate customers.
Variables:
> InvoiceNo: Invoice number. The unique number
of each transaction, namely the invoice. If it
starts with C, the operation was cancelled.

> StockCode: Product code. Unique number for
each product.

> Description: Product name.

> Quantity: Number of products. It expresses how
many of the products on the invoices have been
sold.

> InvoiceDate: Invoice date and time.

> UnitPrice: Product price.

> CustomerID: Unique customer number.

> Country: The country where the customer lives.

DATA PREPARATION 

Steps For Data Preparation:

1. Installing Required Python Libraries.
2. Load and Check Data.
3. Data Pre-processing.
4. Removing Null-Values.
5. Outlier Observations.
6. Exploratory Data Analysis.


A. Installing Required Python Libraries:
Python Code For PIP Installing Packages:
> pip install lifetimes
> pip install openpyxl
> pip install SQLAlchemy
> pip install -U scikit-learn
> pip install squarify
> pip install seaborn
> pip install matplotlib


Python Code For Importing Libraries:
> from sklearn.preprocessing import MinMaxScaler
> from sqlalchemy import create_engine
> from lifetimes import GammaGammaFitter
> from lifetimes import BetaGeoFitter
> from lifetimes.plotting import plot_period_transactions
> import datetime as dt
> import pandas as pd
> import seaborn as sns
> import matplotlib.pyplot as plt
> import squarify
> import warnings
> warnings.filterwarnings("ignore")


B. Load and Check Data:

Python Code For Load Data:

> df = pd.read_excel("c:\\Users\\MyPC\\Downloads\\Online Retail.xlsx")

> df.head()


Output:
   InvoiceNo  StockCode  Description                          Quantity  InvoiceDate          UnitPrice  CustomerID  Country
0  536365     85123A     WHITE HANGING HEART T-LIGHT HOLDER   6         2010-12-01 08:26:00  2.55       17850.0     United Kingdom
1  536365     71053      WHITE METAL LANTERN                  6         2010-12-01 08:26:00  3.39       17850.0     United Kingdom
2  536365     84406B     CREAM CUPID HEARTS COAT HANGER       8         2010-12-01 08:26:00  2.75       17850.0     United Kingdom
3  536365     84029G     KNITTED UNION FLAG HOT WATER BOTTLE  6         2010-12-01 08:26:00  3.39       17850.0     United Kingdom
4  536365     84029E     RED WOOLLY HOTTIE WHITE HEART.       6         2010-12-01 08:26:00  3.39       17850.0     United Kingdom


Table 1: First 5 Instances Of Imported Data-Set.


Python Code For Check Data:
# Drop cancelled invoices (InvoiceNo starting with "C")
> df = df[~df["InvoiceNo"].astype(str).str.contains("C", na=False)]
> df.shape

> def check_df(dataframe):
      print(dataframe.shape)
      print(dataframe.columns)
      print(dataframe.dtypes)
      print(dataframe.head())
      print(dataframe.tail())
      print(dataframe.describe().T)
> check_df(df)


C. Data Pre-processing: 

1. Removing Null-Values:

Python Code For Removing Null-Values:
> df.isnull().sum()

> df.dropna(inplace=True)

> df.isnull().sum()


2. Outlier Observations:
Python Code For Outlier Observations:
> def outlier_thresholds(dataframe, variable):
      # The 1st and 99th percentiles are used in place of the usual quartiles
      quartile1 = dataframe[variable].quantile(0.01)
      quartile3 = dataframe[variable].quantile(0.99)
      interquantile_range = quartile3 - quartile1
      up_limit = quartile3 + 1.5 * interquantile_range
      low_limit = quartile1 - 1.5 * interquantile_range
      return low_limit, up_limit

> def replace_with_thresholds(dataframe, variable):
      # Cap values outside the thresholds instead of dropping the rows
      low_limit, up_limit = outlier_thresholds(dataframe, variable)
      dataframe.loc[(dataframe[variable] < low_limit), variable] = low_limit
      dataframe.loc[(dataframe[variable] > up_limit), variable] = up_limit

> replace_with_thresholds(df, "Quantity")

> replace_with_thresholds(df, "UnitPrice")
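
To see what the capping does, the following sketch applies the same threshold rule (1st and 99th percentiles, widened by 1.5 times their spread) to a small hypothetical series containing one extreme value. `Series.clip` is used here as a compact equivalent of the two `.loc` assignments; the data and the helper name are illustrative assumptions.

```python
import pandas as pd


def percentile_thresholds(series, q_low=0.01, q_high=0.99):
    """Return (low_limit, up_limit) using the same rule as outlier_thresholds."""
    low_q = series.quantile(q_low)
    high_q = series.quantile(q_high)
    spread = high_q - low_q
    return low_q - 1.5 * spread, high_q + 1.5 * spread


# Hypothetical quantities: 1..100 plus one extreme order
quantity = pd.Series(list(range(1, 101)) + [10_000], dtype="float64")

low_limit, up_limit = percentile_thresholds(quantity)
capped = quantity.clip(lower=low_limit, upper=up_limit)

print(low_limit, up_limit)
print(quantity.max(), capped.max())
```

Only the extreme value is pulled down to the upper limit; the ordinary values are left unchanged, so no rows are lost.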


3. Exploratory Data Analysis:

Python Code for Categorical Variables:

> cat_cols = [col for col in df.columns if df[col].dtypes == "O"]

> cat_but_car = [col for col in df.columns if df[col].nunique() > 100 and df[col].dtypes == "O"]

> cat_cols = [col for col in cat_cols if col not in cat_but_car]

> cat_cols


Output: ['Country']


Python Code for Summarizing Categorical
Variables:

> def cat_summary(dataframe, col_name, plot=False):
      # Count and percentage share of each category
      print(pd.DataFrame({col_name: dataframe[col_name].value_counts(),
                          "Ratio": 100 * dataframe[col_name].value_counts() / len(dataframe)}))
      print("##########################################")
      if plot:
          fig_dims = (15, 5)
          fig, ax = plt.subplots(figsize=fig_dims)
          sns.countplot(x=dataframe[col_name], data=dataframe)
          plt.xticks(rotation=45, ha='right')
          plt.show()

> cat_summary(df, "Country", plot=True)


Output: 


Figure 2: Summarizing Categorical Variable 
Country. 


Interpretation:

The above bar plot shows the number of transactions
for each value of the categorical variable Country. The
United Kingdom ranks first with the highest
number of transactions, and Austria ranks last with the
lowest number of transactions in the dataset.


Python Code For Numerical Variables:


> num_cols = [col for col in df.columns if df[col].dtypes != "O" and col != "CustomerID"]

> num_cols

Output:


['Quantity', 'InvoiceDate', 'UnitPrice']
Python Code for Summarizing Numerical
Variables:

> def num_summary(dataframe, numerical_col, plot=False):
      quantiles = [0.05, 0.10, 0.20, 0.30, 0.40, 0.50,
                   0.60, 0.70, 0.80, 0.90, 0.95, 0.99]
      print(dataframe[numerical_col].describe(quantiles).T)
      if plot:
          dataframe[numerical_col].hist(bins=20)
          plt.xlabel(numerical_col)
          plt.title(numerical_col)
          plt.show()

> for col in num_cols:
      num_summary(df, col, plot=True)
Output:

[Histogram of Quantity; x-axis: Quantity, y-axis: number of transactions]
Figure 3: Summarizing Numerical Variable
Quantity.


[Histogram of InvoiceDate; x-axis: InvoiceDate (2011-01 to 2011-11), y-axis: number of transactions]
Figure 4: Summarizing Numerical Variable
Invoice Date.


[Histogram of UnitPrice; x-axis: UnitPrice, y-axis: number of transactions]
Figure 5: Summarizing Numerical Variable Unit
Price.


Interpretation:

The above histograms show the distribution of the numerical
variables (Quantity, InvoiceDate, UnitPrice). More than
300000 transactions have a quantity of 0-10 and nearly
10000 transactions have a quantity of 100-150 in the
dataset. More transactions are made in Q4 than in the
other quarters. More than 200000 transactions have a
unit price of 1-5.


Python Code For How many sales for each
product:
> df_product = df.groupby("Description").agg({"Quantity": "count"})

> df_product.reset_index(inplace=True)
> df_product


Output:
      Description                          Quantity
0     4 PURPLE FLOCK DINNER CANDLES        39
1     50'S CHRISTMAS GIFT BAG LARGE        109
2     DOLLY GIRL BEAKER                    138
3     I LOVE LONDON MINI BACKPACK          70
4     I LOVE LONDON MINI RUCKSACK          1
...
3872  ZINC T-LIGHT HOLDER STARS SMALL      238
3873  ZINC TOP 2 DOOR WOODEN SHELF         9
3874  ZINC WILLIE WINKIE CANDLE STICK      192
3875  ZINC WIRE KITCHEN ORGANISER          12
3876  ZINC WIRE SWEETHEART LETTER TRAY     20


Table 2: Quantity Sold for each product.


Interpretation:

The above table lists the products in alphabetical order
together with the number of invoice lines recorded for each
product (the count of the Quantity column) in the dataset.


Python Code For Top 10 Products:

> top_pr = df_product.sort_values(by="Quantity", ascending=False).head(10)

> sns.barplot(x="Description", y="Quantity", data=top_pr)

> plt.xticks(rotation=90)

> plt.show()

Output:

[Bar plot; x-axis: Description, y-axis: Quantity]
Figure 6: Top 10 Products With Respect To
Quantity Sold.


Interpretation:

The above bar plot shows the top 10 products sorted
by Quantity in the dataset. The
product “WHITE HANGING HEART T-LIGHT HOLDER”
ranks 1st in the top 10 with more than 2000 transactions,
and “PACK OF 72 RETROSPOT CAKE CASES” ranks 10th.


IMPLEMENTATION OF MODEL FOR DATA
ANALYSIS
Steps for Model Implementation:
A. Customer Segmentation With RFM
1. Preparation of RFM Metrics.
2. Generating RFM Scores.
3. Segmenting Customers Based on RFM Scores.
4. Visualization of RFM Segments.
B. Customer Segmentation With CLTV
1. Preparation-Data Structure of CLTV
2. BG-NBD Model
3. Gamma Gamma Model
4. BG-NBD and GG Model For Prediction
5. Segmentation on CLTV Forecasts


A. Customer Segmentation With RFM:
1. Python Code for preparation of RFM Metrics:
# total price per invoice line
> df["TotalPrice"] = df["Quantity"] * df["UnitPrice"]

# Determining the analysis date for the recency
> df["InvoiceDate"] = pd.to_datetime(df["InvoiceDate"])
> df["InvoiceDate"].max()
> today_date = dt.datetime(2011, 12, 11)

# Generating RFM metrics
> rfm = df.groupby("CustomerID").agg({
      "InvoiceDate": lambda Date: (today_date - Date.max()).days,
      "InvoiceNo": lambda Invoice: Invoice.nunique(),
      "TotalPrice": lambda TotalPrice: TotalPrice.sum()})
> rfm.columns = ["recency", "frequency", "monetary"]
> rfm.head()
> rfm.describe().T


Output:
CustomerID  recency  frequency  monetary
12346.0     326      1          310.44
12347.0     3        7          4310.00
12348.0     76       4          1770.78
12349.0     19       1          1491.72
12350.0     311      1          331.46

Table 3: RFM Metrics Generated for Each Customer ID.


[Output of rfm.describe().T: count, mean, std, min, quartiles and max of recency, frequency and monetary for 4339 customers; the individual values are not legible in the source.]

Table 4: Statistical Description of RFM Metrics.

Interpretation: 


The above tables show the top 5 instances of the generated RFM (Recency, Frequency, Monetary) metrics and
the descriptive statistics of the RFM metrics (count, mean, std, min, max).


The “min of Monetary” is 0.0 and the “max of Monetary” is 266163.525.


Because monetary is the total money paid by a customer, its minimum value cannot be 0, so such customers are removed.


Code: 

# let's remove them from the data 
> rfm=rfm[rfm["monetary"] > 0] 
> rfm.describe().T 
Output:
           count   mean       std       min   25%      50%    75%       max
recency    4338.0  93.059474  100.0122  1.00  18.0000  51.0   142.7500  374.000
frequency  4338.0  4.272706   7.706221  1.00  1.0000   2.0    5.0000    210.000
monetary   4338.0  1892.1842  7706.206  3.75  n/a      663.1  n/a       266163.525
(n/a: value not legible in the source)


Table 5: Statistical Description of Corrected RFM Metrics.


Interpretation:
The above table shows that, after customers with a monetary value of 0 are removed, the minimum monetary value
is 3.75.
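
The RFM preparation and the removal of zero-value customers can be reproduced on a small hypothetical transaction table. The sketch below uses named aggregation instead of a dictionary of lambdas, which yields the recency, frequency and monetary columns directly; the toy data is an assumption made only for illustration.

```python
import datetime as dt

import pandas as pd

# Hypothetical transactions (one row per invoice line)
tx = pd.DataFrame({
    "CustomerID": [1, 1, 2, 3],
    "InvoiceNo": ["A1", "A2", "B1", "C1"],
    "InvoiceDate": pd.to_datetime(
        ["2011-12-01", "2011-12-09", "2011-06-01", "2011-11-11"]),
    "TotalPrice": [10.0, 15.0, 0.0, 40.0],
})

analysis_date = dt.datetime(2011, 12, 11)

rfm_toy = tx.groupby("CustomerID").agg(
    # days between the analysis date and the customer's last purchase
    recency=("InvoiceDate", lambda d: (analysis_date - d.max()).days),
    # number of distinct invoices
    frequency=("InvoiceNo", "nunique"),
    # total amount paid
    monetary=("TotalPrice", "sum"),
)

# Customers whose total payment is 0 carry no monetary information
rfm_toy = rfm_toy[rfm_toy["monetary"] > 0]
print(rfm_toy)
```

Customer 2 is dropped by the filter, which mirrors the correction of the minimum monetary value described above.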


> Python Code for Generating RFM Scores

# recency_score (a lower recency is better, so the labels are reversed)

> rfm["recency_score"] = pd.qcut(rfm["recency"], 5, labels=[5, 4, 3, 2, 1])

# frequency_score (ranked first so that ties do not produce duplicate bin edges)

> rfm["frequency_score"] = pd.qcut(rfm["frequency"].rank(method="first"), 5, labels=[1, 2, 3, 4, 5])
# monetary_score

> rfm["monetary_score"] = pd.qcut(rfm["monetary"], 5, labels=[1, 2, 3, 4, 5])

# RFM Score (recency and frequency scores concatenated)

> rfm["RFM_SCORE"] = (rfm["recency_score"].astype(str) + rfm["frequency_score"].astype(str))

> rfm.head(10)


Output:
[Table output: the first 10 rows of rfm with the columns CustomerID, recency, frequency, monetary, recency_score, frequency_score, monetary_score and RFM_SCORE.]
Table 6: Generated RFM Scores for Each Customer ID.
Interpretation: 


The above table shows 10 instances of the generated RFM scores in the dataset. We can observe the RFM
scores of each “Customer ID” individually.
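
The frequency score is computed on ranks rather than on the raw values. A likely reason, shown in the sketch below on hypothetical data, is that many customers share the same frequency, so the quintile boundaries of the raw values coincide and `pd.qcut` cannot build five distinct bins.

```python
import pandas as pd

# Hypothetical frequencies: most customers bought exactly once
frequency = pd.Series([1, 1, 1, 1, 1, 1, 2, 3, 5, 8])

try:
    pd.qcut(frequency, 5, labels=[1, 2, 3, 4, 5])
    direct_ok = True
except ValueError:
    # Several quintile edges are equal, so the bins are not unique
    direct_ok = False

# Ranking with method="first" breaks ties by order of appearance,
# which gives ten distinct values and therefore five equal-sized bins
ranked_score = pd.qcut(frequency.rank(method="first"), 5, labels=[1, 2, 3, 4, 5])

print(direct_ok)
print(ranked_score.tolist())
```

The trade-off is that customers with identical frequencies can receive different scores, depending only on their position in the table.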


> Python Code for Segmenting Customers Based on RFM Scores
> seg_map = {
      r'[1-2][1-2]': 'hibernating',
      r'[1-2][3-4]': 'at_Risk',
      r'[1-2]5': 'cant_loose',
      r'3[1-2]': 'about_to_sleep',
      r'33': 'need_attention',
      r'[3-4][4-5]': 'loyal_customers',
      r'41': 'promising',
      r'51': 'new_customers',
      r'[4-5][2-3]': 'potential_loyalists',
      r'5[4-5]': 'champions'
  }

> rfm['segment'] = rfm['RFM_SCORE'].replace(seg_map, regex=True)

> rfm.head(10)

Output:
CustomerID  recency  frequency  monetary  recency_score  frequency_score  monetary_score  RFM_SCORE  segment
12346.0     326      1          310.44    1              1                2               11         hibernating
12347.0     3        7          4310.00   5              5                5               55         champions
12348.0     76       4          1770.78   2              4                4               24         at_Risk
12349.0     19       1          1491.72   4              1                4               41         promising
12350.0     311      1          331.46    1              1                2               11         hibernating
[Rows 6-10 of the output are not legible in the source.]
Table 7: Customers Segmentation based on generated RFM scores.

Interpretation: 


The above table shows the top 10 instances of customers segmented by the generated RFM scores in the dataset.
We can observe the segment assigned to each “Customer ID” individually.
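
The regular-expression mapping can be checked on a handful of two-character scores. The sketch below uses only three of the patterns and a hypothetical list of scores; the first character is the recency score and the second the frequency score.

```python
import pandas as pd

# Subset of the segment map, for illustration only
seg_map_demo = {
    r'[1-2][1-2]': 'hibernating',
    r'5[4-5]': 'champions',
    r'[3-4][4-5]': 'loyal_customers',
}

# Hypothetical RFM_SCORE values
scores = pd.Series(["11", "55", "34", "45"])

# With regex=True every key is treated as a pattern and the match is replaced
segments = scores.replace(seg_map_demo, regex=True)
print(segments.tolist())
```

This works because every score is exactly two digits, so each pattern matches the whole string and the segment names, which contain no digits, cannot be matched again by a later pattern.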


Python Code for grouping RFM mean and count values according to segments


> rfm[["segment", "recency", "frequency", "monetary"]].groupby("segment").agg(["mean", "count"])
Output:
[Table output: mean and count of recency, frequency and monetary for each segment (about_to_sleep, at_Risk, cant_loose, champions, hibernating, loyal_customers, need_attention, new_customers, potential_loyalists, promising); the mean values are not legible in the source.]
Table 8: RFM mean and count values according to Customer Segments
Interpretation:


The above table shows the count and mean of the RFM metrics for each segment derived from the RFM scores. The
number of hibernating customers is 1071, loyal_customers 819, champions 633,
at_Risk 593, potential_loyalists 484, about_to_sleep 352,
need_attention 187, promising 94 and cant_loose 63.
> Python code for Visualization of RFM Segments
> sgm = rfm["segment"].value_counts()
> plt.figure(figsize=(10, 7))
> sns.barplot(x=sgm.index, y=sgm.values)
> plt.xticks(rotation=45)
> plt.title('Customer Segments', color='blue', fontsize=15)
> plt.show()


Output:
[Bar plot titled “Customer Segments”]
Figure 7: Bar Plot Of Customer Segmentation Based On RFM Scores.
Interpretation: 


The above bar plot visualizes the customer segmentation based on the generated RFM scores. The bars are in
descending order of segment size: hibernating (1st), loyal_customers (2nd), champions (3rd), at_Risk (4th),
potential_loyalists (5th), about_to_sleep (6th), need_attention (7th), promising (8th), cant_loose (9th) and
new_customers (10th).


Python Code for Treemap Visualization 
> df_treemap = rfm.groupby('segment').agg('count').reset_index() 
> df_treemap.head() 


Output: 
   segment         recency  frequency  monetary  recency_score  frequency_score  monetary_score  RFM_SCORE 
0  about_to_sleep      352        352       352            352              352             352        352 
1  at_Risk             593        593       593            593              593             593        593 
2  cant_loose           63         63        63             63               63              63         63 
3  champions           633        633       633            633              633             633        633 
4  hibernating        1071       1071      1071           1071             1071            1071       1071 
Table 9: Customer Segmentation Count Based On RFM Scores. 

Interpretation: 


The above table shows the number of customers in each RFM segment for the first five segments returned by head(). 
We can observe that the count for about_to_sleep is 352, at_Risk is 593, cant_loose is 63, champions is 633 and 
hibernating is 1071. 


Python Code for Plotting Treemap 
> fig, ax = plt.subplots(1, figsize=(16, 10)) 
> squarify.plot(sizes=df_treemap['RFM_SCORE'], 
label=df_treemap['segment'], 
alpha=.8, 
color=['tab:red', 'tab:purple', 'tab:brown', 'tab:pink', 'tab:gray'] 
) 
> plt.axis('off') 
> plt.savefig('treemap.png')  # save before show(); saving afterwards can produce a blank image 
> plt.show() 
Output: 


Figure 8: Tree Map Visualization Of Customer Segmentation. 


Interpretation: 
The above visualization represents a tree map of the customer segmentation based on the RFM scores generated from the 
dataset provided. The tree map lets us observe the customer segmentation at a glance. 


A. Customer Segmentation With CLTV: 

1. Python Code for Preparation-Data Structure of CLTV 
# Determining the analysis date for the recency 
> df["InvoiceDate"] = pd.to_datetime(df["InvoiceDate"]) 
> df["InvoiceDate"].max() 
> today_date = dt.datetime(2011, 12, 11) 
# Generating CLTV metrics 
> cltv_df = df.groupby('CustomerID').agg({'InvoiceDate': [lambda date: (date.max() - date.min()).days, 
lambda date: (today_date - date.min()).days], 
'InvoiceNo': lambda num: num.nunique(), 
'TotalPrice': lambda TotalPrice: TotalPrice.sum()}) 
> cltv_df.columns = cltv_df.columns.droplevel(0) 
> cltv_df.columns = ['recency', 'T', 'frequency', 'monetary'] 
> cltv_df.head() 


Output: 
[Table output not legible in the source. Columns: CustomerID, recency, T, frequency, monetary.] 
Table 10: RFTM Values of Each Customer ID. 
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Code: 
# we calculated the monetary values as the Total price. 
# At this point we will express the monetary value as the average earnings per purchase. 
> cltv_df["monetary"] = cltv_df["monetary"] / cltv_df["frequency"] 
# Selecting monetary values greater than zero. 
> cltv_df = cltv_df[cltv_df["monetary"] > 0] 
# Weekly expression of Recency and T for BGNBD 
> cltv_df["recency"] = cltv_df["recency"] / 7 
> cltv_df["T"] = cltv_df["T"] / 7 
# Selecting Frequency greater than 1 
> cltv_df = cltv_df[(cltv_df['frequency'] > 1)] 
> cltv_df.head() 


Output: 
CustomerID  recency    T          frequency  monetary 
12347.0     52.142857  52.571429  7          615.714286 
12348.0     40.285714  51.285714  4          442.695000 
12352.0     37.142857  42.428571  8          239.542500 
12356.0     43.142857  46.571429  3          937.143333 
12358.0     21.285714  21.571429  2          575.210000 
Table 11: Updated RFTM Values of Each Customer ID. 

Interpretation: 


The above table represents the data structure prepared for generating CLTV based on the RFM metrics. Here we have 
added a new column “T”, which represents the “Tenure” of each individual Customer ID, i.e. the number of weeks between the customer's first purchase and the analysis date. 
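
To make the four inputs concrete, the following is a minimal, dependency-free sketch of the same preparation steps on a toy transaction list. The transactions, the analysis date and the helper name `cltv_inputs` are hypothetical and are not taken from the dataset used in this study.

```python
from datetime import date

# Hypothetical toy transactions: (customer_id, invoice_no, invoice_date, total_price)
transactions = [
    ("C1", "INV1", date(2011, 1, 3), 100.0),
    ("C1", "INV2", date(2011, 1, 17), 50.0),
    ("C1", "INV3", date(2011, 2, 14), 150.0),
    ("C2", "INV4", date(2011, 2, 7), 80.0),
]
analysis_date = date(2011, 2, 28)  # assumed analysis date for the toy data


def cltv_inputs(rows, today):
    """Return {customer: (recency_weeks, T_weeks, frequency, avg_monetary)}."""
    by_customer = {}
    for cust, invoice, day, price in rows:
        entry = by_customer.setdefault(
            cust, {"dates": [], "invoices": set(), "total": 0.0}
        )
        entry["dates"].append(day)
        entry["invoices"].add(invoice)
        entry["total"] += price

    result = {}
    for cust, e in by_customer.items():
        first, last = min(e["dates"]), max(e["dates"])
        frequency = len(e["invoices"])      # number of unique invoices
        avg_monetary = e["total"] / frequency
        if frequency <= 1 or avg_monetary <= 0:
            continue                        # same filters as in the text
        recency = (last - first).days / 7   # weeks between first and last purchase
        tenure = (today - first).days / 7   # weeks between first purchase and analysis date
        result[cust] = (recency, tenure, frequency, avg_monetary)
    return result


metrics = cltv_inputs(transactions, analysis_date)
```

In this toy data, customer C2 has a single invoice and is therefore dropped by the frequency filter, mirroring the selection of customers with frequency greater than 1.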


> Python Code for BG-NBD Model 

# For modelling, we first fit our Frequency, Recency and T columns to the BG/NBD model. 
> bef = BetaGeoFitter(penalizer_coef=0.001) 
> bef.fit(cltv_df['frequency'], 
cltv_df['recency'], 
cltv_df['T']) 
# 1 week expected purchase (transaction) 
> cltv_df["expected_purc_1_week"] = bef.predict(1, 
cltv_df['frequency'], 
cltv_df['recency'], 
cltv_df['T']) 
> cltv_df.sort_values("expected_purc_1_week", ascending=False).head(10) 


@ IJTSRD | Unique Paper ID-ITSRD51952 | Volume-—6 | Issue—6 | September-October 2022 Page 753 


International Journal of Trend in Scientific Research and Development @ www.ijtsrd.com eISSN: 2456-6470 
Output: 


[Table output not legible in the source. Columns: CustomerID, recency, T, frequency, monetary, expected_purc_1_week.] 
Table 12 : Expected Purchase Rate For 1 Week, Generated by Fitting BG-NBD Model based on RFM Metrics. 


Interpretation: 

The above table represents the expected number of purchases for 1 week, generated by fitting the BG-NBD model on the 
RFM metrics. We can observe the expected purchases for 1 week of every individual with the respective “Customer ID”. 

Python code for 1 month expected purchase 

> cltv_df["expected_purc_1_month"] = bef.predict(4, cltv_df['frequency'], cltv_df['recency'], cltv_df['T']) 

> cltv_df.sort_values("expected_purc_1_month", ascending=False).head(10) 


Output: 
[Table output not legible in the source. Columns: CustomerID, recency, T, frequency, monetary, expected_purc_1_week, expected_purc_1_month.] 

Table 13 : Expected Purchase Rate For 1 Month, Generated by Fitting BG-NBD Model based on RFM Metrics. 
Interpretation: 


The above table represents the expected number of purchases for 1 month (4 weeks), generated by fitting the BG-NBD model on the RFM 
metrics. 


> Python Code for Gamma Gamma Model 

# For modelling, we fit our Frequency and Monetary columns to the Gamma-Gamma submodel 
> ggf = GammaGammaFitter(penalizer_coef=0.01) 
> ggf.fit(cltv_df['frequency'], cltv_df['monetary']) 
# Expected average profit 
> cltv_df["expected_average_profit"] = ggf.conditional_expected_average_profit(cltv_df['frequency'], cltv_df['monetary']) 
> cltv_df.sort_values("expected_average_profit", ascending=False).head(20) 
Output: 
Table 14 : Expected Average Profit, Generated by Fitting Gamma-Gamma Submodel based on RFM Metrics. 


Interpretation: 
The above table represents the expected average profit per transaction of each customer, generated by fitting the 
Gamma-Gamma submodel on the frequency and monetary values. This value can be used as a benchmark when deciding the Customer Acquisition Cost. 
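
As an illustration of what the expected average profit means, the sketch below implements the standard Gamma-Gamma conditional expectation: a weighted average of the population mean and the customer's own observed average, where the weight on the customer's own average grows with frequency. The function name and the parameter values `p`, `q` and `v` are illustrative assumptions, not the parameters estimated in this study.

```python
def expected_average_profit(frequency, avg_monetary, p, q, v):
    """Gamma-Gamma conditional expectation of a customer's mean transaction value.

    p, q and v are the model parameters (q must be greater than 1).
    The values used below are illustrative only.
    """
    population_mean = v * p / (q - 1)
    # Weight placed on the customer's own observed average; approaches 1 as frequency grows
    weight = p * frequency / (p * frequency + q - 1)
    return (1 - weight) * population_mean + weight * avg_monetary


# Hypothetical parameters: the population mean is 15 * 6 / (4 - 1) = 30
few_purchases = expected_average_profit(2, 100.0, p=6.0, q=4.0, v=15.0)
many_purchases = expected_average_profit(20, 100.0, p=6.0, q=4.0, v=15.0)
```

With only two purchases the estimate is pulled towards the population mean, while with twenty purchases it stays close to the customer's observed average of 100.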


> Python Code for BG-NBD and GG Model For Prediction 
> cltv = ggf.customer_lifetime_value(bef, 
cltv_df['frequency'], 
cltv_df['recency'], 
cltv_df['T'], 
cltv_df['monetary'], 
time=6, # 6 months 
freq="W", # T's frequency information (weeks) 
discount_rate=0.01) 
# Reset index 
> cltv = cltv.reset_index() 
# Merging the main table and the forecast values table 
> cltv_final = cltv_df.merge(cltv, on="CustomerID", how="left") 
# Sorting 
> cltv_final.sort_values(by="clv", ascending=False).head(10) 


Output: 
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Table 15 : CLTV Generated by Fitting BG-NBD Model based on RFM Metrices. 
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Interpretation: 


> The above table represents the 6-month CLV, generated by combining the BG-NBD and Gamma-Gamma model predictions based on the RFM metrics. 
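
The role of the `time` and `discount_rate` arguments can be illustrated with a simplified discounted cash flow calculation. The sketch below assumes a constant expected number of purchases per month, which is a simplification: the BG-NBD forecast generally changes from month to month. The function name and the input values are hypothetical, and this is not the library's own implementation.

```python
def discounted_clv(expected_purchases_per_month, expected_profit_per_purchase,
                   months, monthly_discount_rate):
    """Sum the discounted monthly cash flows over the forecast horizon."""
    clv = 0.0
    for month in range(1, months + 1):
        cash_flow = expected_purchases_per_month * expected_profit_per_purchase
        clv += cash_flow / (1 + monthly_discount_rate) ** month
    return clv


# Hypothetical customer: 0.5 expected purchases per month, 100 expected profit per purchase
clv_6 = discounted_clv(0.5, 100.0, months=6, monthly_discount_rate=0.01)
clv_12 = discounted_clv(0.5, 100.0, months=12, monthly_discount_rate=0.01)
```

A longer horizon adds further discounted months, so the 12-month value is larger than the 6-month value but less than double it.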


Python Code for 12 Month CLTV Prediction: 


# 12 Month CLTV Forecast: 
> cltv_12 = ggf.customer_lifetime_value(bef, 
cltv_df['frequency'], 
cltv_df['recency'], 
cltv_df['T'], 
cltv_df['monetary'], 
time=12, # 12 months 
freq="W", # T's frequency information (weeks) 
discount_rate=0.01) 
> cltv_12.head() 
> cltv_12 = cltv_12.reset_index() 
> cltv_12 = cltv_df.merge(cltv_12, on="CustomerID", how="left") 
> cltv_12.sort_values(by="clv", ascending=False).head(10) 

Output: 

[Table output not legible in the source. Columns: CustomerID, recency, T, frequency, monetary, expected_purc_1_week, expected_purc_1_month, expected_average_profit, clv.] 
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Table 16 : 12 Month CLTV Forecast, Generated by Fitting BG-NBD Model based on RFM Metrices. 

Interpretation: 


The above table represents the 12-month CLTV prediction, generated by combining the BG-NBD and Gamma-Gamma model predictions based 
on the RFM metrics. 


> Python Code for Segmentation on CLTV Forecasts 
Python code for Normalization 0-1 Range For CLV Values 


> scaler = MinMaxScaler(feature_range=(0, 1)) 
> scaler.fit(cltv_final[["clv"]]) 
> cltv_final["scaled_clv"] = scaler.transform(cltv_final[["clv"]]) 
> cltv_final.sort_values(by="scaled_clv", ascending=False).head() 
Output: 
[Table output not legible in the source. Columns: CustomerID, recency, T, frequency, monetary, expected_purc_1_week, expected_purc_1_month, expected_average_profit, clv, scaled_clv.] 
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Table 17 : Scaled CLTV Rate, Generated by Fitting BG-NBD Model based on RFM Metrices. 
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Interpretation: 
The above table represents the CLV values scaled to the range 0 to 1, generated from the RFM metrics in the dataset provided. 
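
The 0-1 normalization maps the smallest CLV to 0 and the largest to 1 using (x - min) / (max - min). A minimal sketch of this arithmetic, with hypothetical CLV values and a hypothetical helper name, is given below.

```python
def min_max_scale(values):
    """Rescale values linearly so that the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All values identical: no spread to scale, return zeros
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


clv_values = [120.0, 480.0, 300.0, 1020.0]  # hypothetical CLV values
scaled = min_max_scale(clv_values)
```

Scaling does not change the ordering of customers, so segmenting on the scaled values gives the same groups as segmenting on the raw CLV.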


Python code for Segmentation of Customers 

> cltv_final["segment"] = pd.qcut(cltv_final["scaled_clv"], 4, labels=["D", "C", "B", "A"]) 
> cltv_final.head() 



Output: 
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Table 18 : Customer Segmentation based on CLTV Forecasted Values. 
Interpretation: 


The above table represents each customer labelled with a segment based on the quartile of the scaled CLV generated from the dataset provided, where “A” is the highest quartile and “D” is the lowest. 
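
The quartile labelling can be sketched without pandas as follows. The example is intended to mirror the idea of splitting the scaled CLV into four equal-sized groups. The input values and the helper name `quartile_segments` are hypothetical, and edge cases such as tied cut points are not handled.

```python
from bisect import bisect_left
from statistics import quantiles


def quartile_segments(values, labels=("D", "C", "B", "A")):
    """Label each value by its quartile; the lowest quartile receives labels[0]."""
    # Three cut points (25%, 50%, 75%) computed with linear interpolation
    cuts = quantiles(values, n=4, method="inclusive")
    # bisect_left places a value equal to a cut point in the lower group
    return [labels[bisect_left(cuts, v)] for v in values]


scaled_clv = [0.05, 0.90, 0.30, 0.10, 0.55, 0.70, 0.20, 1.00]  # hypothetical values
segments = quartile_segments(scaled_clv)
```

Because the cut points are quantiles of the data itself, each label receives the same number of customers in this example.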
Python code for Examination of Segments by count 


> cltv_final.groupby("segment").agg(["count"]) 


Output: 
[Table output not legible in the source. Columns: count of CustomerID, recency, T, frequency, monetary, expected_purc_1_week, expected_purc_1_month, expected_average_profit, clv and scaled_clv for each segment.] 
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Table 19 : Count of Customer Segments based on CLTV Forecasted Values. 
Interpretation: 


The above table represents the number of customers in each segment label, based on the CLV generated from the dataset provided. 
Python code for Examination of Segments by sum 

> cltv_final.groupby("segment").agg(["sum"]) 

Output: 
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Table 20 : Sum of Customer Segments based on CLTV Forecasted Values. 
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Interpretation: 
The above table represents the sum of each metric per segment label, based on the CLV generated from the dataset provided. 


Python code for Examination of Segments by mean 


> cltv_final.groupby("segment").agg(["mean"]) 


Output: 
[Table output not legible in the source. Rows: segments D, C, B, A. Columns: mean of CustomerID, recency, T, frequency, monetary, expected_purc_1_week, expected_purc_1_month, expected_average_profit, clv and scaled_clv.] 
Table 21 : Mean of Customer Segments based on CLTV Forecasted Values. 
Interpretation: 


The above table represents the mean of each metric per segment label, based on the CLV generated from the dataset provided. 
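
The three examinations (count, sum and mean per segment) can be summarized in one pass. The following dependency-free sketch uses hypothetical (segment, CLV) pairs standing in for rows of the final table. The helper name `summarize` is an assumption for illustration.

```python
from collections import defaultdict

# Hypothetical (segment, clv) pairs standing in for rows of the final table
rows = [("A", 900.0), ("A", 700.0), ("B", 400.0),
        ("C", 200.0), ("C", 100.0), ("D", 40.0)]


def summarize(pairs):
    """Return {segment: {"count": ..., "sum": ..., "mean": ...}}."""
    grouped = defaultdict(list)
    for segment, value in pairs:
        grouped[segment].append(value)
    return {
        seg: {"count": len(vals), "sum": sum(vals), "mean": sum(vals) / len(vals)}
        for seg, vals in sorted(grouped.items())
    }


summary = summarize(rows)
```

Reading the three statistics side by side helps separate a segment that is valuable because it is large from one that is valuable because each customer in it is worth more.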


FINDINGS OF THE STUDY 
> Most of the transactions are made from the United Kingdom; the least number of transactions are made from Austria. 
> The highest number of transactions are made between the dates 2011/09 and 2011/11. 
> Top 10 products being bought by the Customers. 
> Based on RFM metrics, the number of Hibernating customers is “1071”. 
> Loyal_customers count based on RFM metrics is “819”. 
> Champions customers count based on RFM metrics is “633”. 
> At_Risk customers count based on RFM metrics is “593”. 
> Potential_loyalists customers count based on RFM metrics is “484”. 
> About_to_sleep customers count based on RFM metrics is “352”. 
> Need_attention customers count based on RFM metrics is “187”. 
> Promising customers count based on RFM metrics is “94”. 
> Cant_loose customers count based on RFM metrics is “63”. 
> New_customers count based on RFM metrics is [value illegible in the source]. 
> Based on CLTV metrics, Segment “B” group customers generate more CLTV than any other segment. 
> Based on CLTV metrics, Segment “D” group customers generate less CLTV than any other segment. 
> Based on CLTV metrics, Segment “A” group customers generate more Frequency than any other segment. 
> Based on CLTV metrics, Segment “A” group customers generate more Expected Average Profit than any other segment. 

SUGGESTIONS 
> Running customer campaigns for “C” segment customers can increase CLV, and Customer Acquisition Cost (CAC) can be reduced when compared with “D” segment customers. 
> Running Loyalty based campaigns for “A” segment customers can help generate more CLV and reduce Customer Acquisition Cost (CAC) when compared with “B” segment customers. 
> Concentrating on customers who are Hibernating customers and About_to_sleep customers can help in improving “Customer Retention”. 
> Concentrating on customers who are About_to_sleep customers and Need_attention customers can help in reducing “Customer Attrition”. 

CONCLUSION 
> CLTV model helps you determine how much money you can afford to spend acquiring new customers and retaining existing ones, and RFM can be used to segment your customers to better target your marketing efforts. All the efforts to perform these analyses have only one goal, which is to make better business decisions based on data. Using CLTV and RFM simultaneously and interpreting data based on both analyses can help businesses grow. 


Performing these analyses using Python and 
Lifetimes modules is not the most important and 
complicated part. Knowing your business, having 
domain knowledge and using that knowledge to 
produce useful results from these analyses is the 
key to a business’s success. 


It’s not wise to serve all customers with the same 
product model, email, text message campaign, or 
ad. Customers have different needs. A one-size-fits-all 
approach to business will generally result in 
less engagement, lower click-through rates, and 
ultimately fewer sales. Customer segmentation is 
the cure for this problem. 


Finding an optimal number of unique customer 
groups will help you understand how your 
customers differ, and help you give them exactly 
what they want. Customer segmentation improves 
customer experience and boosts company revenue. 
That’s why segmentation is a must if you want to 
surpass your competitors and get more customers. 
Doing it with machine learning is definitely the 
right way to go. 
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